AITopics | diabetes dataset

Exploring and Interacting with the Set of Good Sparse Generalized Additive Models

Neural Information Processing Systems, Feb-16-2026, 14:41:36 GMT

The Rashomon set is the set of models that are approximately as good as the best model in the class. That is, it is the set of all good models.
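
As a rough illustration of the definition above, the sketch below filters a finite list of candidate models down to those whose loss is within a tolerance of the best one. It assumes a multiplicative tolerance `epsilon` and a small enumerated set of candidates. The model names and loss values are hypothetical, and the paper's own construction for sparse generalized additive models is not shown in this listing.

```python
def rashomon_set(losses, epsilon):
    """Return names of models whose loss is within (1 + epsilon) of the best loss."""
    best = min(losses.values())
    threshold = (1.0 + epsilon) * best
    return sorted(name for name, loss in losses.items() if loss <= threshold)


# Hypothetical candidate models and validation losses (illustrative values only).
candidate_losses = {"gam_a": 0.200, "gam_b": 0.205, "gam_c": 0.260, "gam_d": 0.209}
good_models = rashomon_set(candidate_losses, epsilon=0.05)
print(good_models)
```

An additive tolerance (best loss plus `epsilon`) is an equally common convention; which one applies depends on the paper's definition.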

artificial intelligence, data mining, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States (0.04)
North America > Canada > British Columbia (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Industry: Health & Medicine > Therapeutic Area (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Data Science > Data Mining (0.68)

Does the Model Say What the Data Says? A Simple Heuristic for Model Data Alignment

Salgado, Henry, Kendall, Meagan R., Ceberio, Martine

arXiv.org Artificial Intelligence, Dec-9-2025

In this work, we propose a simple and computationally efficient framework for evaluating whether machine learning models align with the structure of the data they learn from; that is, whether the model says what the data says. Unlike existing interpretability methods that focus exclusively on explaining model behavior, our approach establishes a baseline derived directly from the data itself. Drawing inspiration from Rubin's Potential Outcomes Framework, we quantify how strongly each feature separates the two outcome groups in a binary classification task, moving beyond traditional descriptive statistics to estimate each feature's effect on the outcome. By comparing these data-derived feature rankings with model-based explanations, we provide practitioners with an interpretable and model-agnostic method for assessing model-data alignment.
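
The comparison described above can be sketched as follows. This is a minimal illustration, not the authors' estimator: it assumes an absolute standardized mean difference as the data-derived separation score and a Spearman rank correlation as the alignment measure. The toy rows, labels, and `model_importances` are hypothetical.

```python
from statistics import mean, pstdev


def separation_scores(rows, labels):
    """Absolute standardized mean difference of each feature between the two outcome groups."""
    scores = []
    for j in range(len(rows[0])):
        pos = [r[j] for r, y in zip(rows, labels) if y == 1]
        neg = [r[j] for r, y in zip(rows, labels) if y == 0]
        spread = pstdev([r[j] for r in rows]) or 1.0  # guard against constant features
        scores.append(abs(mean(pos) - mean(neg)) / spread)
    return scores


def ranking(scores):
    """Feature indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda j: -scores[j])


def spearman(rank_a, rank_b):
    """Spearman rank correlation between two tie-free rankings of the same features."""
    n = len(rank_a)
    pos_a = {f: i for i, f in enumerate(rank_a)}
    pos_b = {f: i for i, f in enumerate(rank_b)}
    d2 = sum((pos_a[f] - pos_b[f]) ** 2 for f in pos_a)
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


# Hypothetical binary-classification data: four samples, three features.
rows = [[1.0, 5.0, 0.2], [1.2, 4.0, 0.6], [3.0, 5.5, 0.5], [3.3, 4.5, 0.9]]
labels = [0, 0, 1, 1]
data_rank = ranking(separation_scores(rows, labels))

# Hypothetical model-based importances (for example, from an explainer).
model_importances = [0.2, 0.5, 0.3]
model_rank = ranking(model_importances)

alignment = spearman(data_rank, model_rank)
print(data_rank, model_rank, alignment)
```

A correlation near 1 would suggest the model ranks features the way the data does; a value near -1 flags a disagreement worth inspecting.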

alignment, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2511.21931

Country:

North America > United States > Texas (0.15)
Europe > Austria > Vienna (0.14)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area (0.98)
Health & Medicine > Diagnostic Medicine (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

CID: Measuring Feature Importance Through Counterfactual Distributions

Conti, Eddie, Parafita, Álvaro, Brando, Axel

arXiv.org Artificial Intelligence, Dec-5-2025

Assessing the importance of individual features in Machine Learning is critical to understand the model's decision-making process. While numerous methods exist, the lack of a definitive ground truth for comparison highlights the need for alternative, well-founded measures. This paper introduces a novel post-hoc local feature importance method called Counterfactual Importance Distribution (CID). We generate two sets of positive and negative counterfactuals, model their distributions using Kernel Density Estimation, and rank features based on a distributional dissimilarity measure. This measure, grounded in a rigorous mathematical framework, satisfies key properties required to function as a valid metric. We showcase the effectiveness of our method by comparing with well-established local feature importance explainers. Our method not only offers complementary perspectives to existing approaches, but also improves performance on faithfulness metrics (both for comprehensiveness and sufficiency), resulting in more faithful explanations of the system. These results highlight its potential as a valuable tool for model analysis.
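
The pipeline in the abstract (two counterfactual sets, a kernel density estimate for each, and a per-feature dissimilarity used for ranking) can be sketched roughly as below. The abstract does not name the dissimilarity measure, so this sketch substitutes a total-variation-style distance computed on a grid. The fixed bandwidth and the counterfactual values are also assumptions made for illustration.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a 1-D Gaussian kernel density estimate as a callable."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


def dissimilarity(samples_p, samples_q, bandwidth=0.3, grid_size=400):
    """Approximate 0.5 * integral |p - q| dx with a midpoint rule (stand-in measure)."""
    lo = min(samples_p + samples_q) - 4 * bandwidth
    hi = max(samples_p + samples_q) + 4 * bandwidth
    step = (hi - lo) / grid_size
    p = gaussian_kde(samples_p, bandwidth)
    q = gaussian_kde(samples_q, bandwidth)
    total = 0.0
    for i in range(grid_size):
        x = lo + (i + 0.5) * step
        total += abs(p(x) - q(x))
    return 0.5 * step * total


# Hypothetical per-feature values taken from positive and negative counterfactuals.
positive_cf = {"glucose": [1.8, 2.0, 2.2, 2.1], "age": [0.1, -0.2, 0.0, 0.3]}
negative_cf = {"glucose": [-2.0, -1.9, -2.2, -1.8], "age": [0.0, 0.2, -0.1, -0.3]}

scores = {f: dissimilarity(positive_cf[f], negative_cf[f]) for f in positive_cf}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Features whose positive and negative counterfactual distributions differ most come first in `ranked`.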

explanation, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2511.15371

Genre: Research Report (1.00)

Industry: Health & Medicine (0.48)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)

Quantum Machine Learning in Healthcare: Evaluating QNN and QSVM Models

Tudisco, Antonio, Volpe, Deborah, Turvani, Giovanna

arXiv.org Artificial Intelligence, Dec-2-2025

Effective and accurate diagnosis of diseases such as cancer, diabetes, and heart failure is crucial for timely medical intervention and improving patient survival rates. Machine learning has revolutionized diagnostic methods in recent years through classification models that detect diseases based on selected features. However, these classification tasks are often highly imbalanced, limiting the performance of classical models. Quantum models offer a promising alternative, exploiting their ability to express complex patterns by operating in a higher-dimensional computational space through superposition and entanglement. These unique properties make quantum models potentially more effective in addressing the challenges of imbalanced datasets. This work evaluates the potential of quantum classifiers in healthcare, focusing on Quantum Neural Networks (QNNs) and Quantum Support Vector Machines (QSVMs), comparing them with popular classical models. The study is based on three well-known healthcare datasets: Prostate Cancer, Heart Failure, and Diabetes. The results indicate that QSVMs outperform QNNs across all datasets because QNNs are susceptible to overfitting. Furthermore, quantum models show the ability to outperform classical models in scenarios with high dataset imbalance. Although preliminary, these findings highlight the potential of quantum models in healthcare classification tasks and lead the way for further research in this domain.

artificial intelligence, dataset, machine learning, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/IJCNN64981.2025.11227750

2505.20804

Genre: Research Report > Experimental Study (0.35)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.91)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.58)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.68)

Multi-VQC: A Novel QML Approach for Enhancing Healthcare Classification

Tudisco, Antonio, Volpe, Deborah, Turvani, Giovanna

arXiv.org Artificial Intelligence, Dec-2-2025

Accurate and reliable diagnosis of diseases is crucial in enabling timely medical treatment and enhancing patient survival rates. In recent years, Machine Learning has revolutionized diagnostic practices by creating classification models capable of identifying diseases. However, these classification problems often suffer from significant class imbalances, which can inhibit the effectiveness of traditional models. Therefore, the interest in Quantum models has arisen, driven by the captivating promise of overcoming the limitations of the classical counterpart thanks to their ability to express complex patterns by mapping data in a higher-dimensional computational space. This work proposes a novel approach for enhancing the classification performance of Quantum Neural Networks (QNN) consisting of multiple Variational Quantum Circuits (VQCs) arranged sequentially. This strategy increases the nonlinearity of the model by exploiting the measurement operation and improving its ability to capture complex patterns.
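
To see why chaining circuits through measurement adds nonlinearity, consider the toy single-qubit sketch below. It is an assumption-laden illustration, not the architecture from the paper: each circuit encodes its input with an RY rotation, applies one trainable RY rotation, and returns the expectation value of Z, which is then re-encoded as the input of the next circuit. How the paper actually passes measurement results between circuits is not stated in this abstract.

```python
import math


def ry(angle, state):
    """Apply a single-qubit RY(angle) rotation to a real two-amplitude state."""
    c, s = math.cos(angle / 2.0), math.sin(angle / 2.0)
    a0, a1 = state
    return (c * a0 - s * a1, s * a0 + c * a1)


def vqc(x, theta):
    """Toy circuit: encode x with RY, apply a trainable RY(theta), return <Z>."""
    state = ry(theta, ry(x, (1.0, 0.0)))
    return state[0] ** 2 - state[1] ** 2


def multi_vqc(x, thetas):
    """Chain circuits: each measured expectation becomes the next circuit's input."""
    value = x
    for theta in thetas:
        value = vqc(value, theta)
    return value


# One circuit gives cos(x + theta); two chained circuits give cos(cos(x + t1) + t2).
single = vqc(0.3, 0.4)
stacked = multi_vqc(0.3, [0.4, 0.5])
print(single, stacked)
```

In this toy setting a single circuit can only produce a cosine of its input, while the chained version composes cosines, which is the extra nonlinearity the measurement step provides.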

artificial intelligence, dataset, machine learning, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/QSW67625.2025.00013

2505.20797

Country: Europe > Italy (0.14)

Genre: Research Report > Promising Solution (0.34)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.51)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)

b1719f44953c2e0754a016ab267fe4e7-Paper-Conference.pdf

Neural Information Processing Systems, Oct-9-2025, 05:06:07 GMT

artificial intelligence, data mining, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States (0.04)
North America > Canada > British Columbia (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Industry: Health & Medicine > Therapeutic Area (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Data Science > Data Mining (0.68)

Evaluating Angle and Amplitude Encoding Strategies for Variational Quantum Machine Learning: their impact on model's accuracy

Tudisco, Antonio, Marchesin, Andrea, Zamboni, Maurizio, Graziano, Mariagrazia, Turvani, Giovanna

arXiv.org Artificial Intelligence, Aug-5-2025

Recent advancements in Quantum Computing and Machine Learning have increased attention to Quantum Machine Learning (QML), which aims to develop machine learning models by exploiting the quantum computing paradigm. One of the widely used models in this area is the Variational Quantum Circuit (VQC), a hybrid model where the quantum circuit handles data inference while classical optimization adjusts the parameters of the circuit. The quantum circuit consists of an encoding layer, which loads data into the circuit, and a template circuit, known as the ansatz, responsible for processing the data. This work involves performing an analysis by considering both Amplitude- and Angle-encoding models, and examining how the type of rotational gate applied affects the classification performance of the model. This comparison is carried out by training the different models on two datasets, Wine and Diabetes, and evaluating their performance. The study demonstrates that, under identical model topologies, the difference in accuracy between the best and worst models ranges from 10% to 30%, with differences reaching up to 41%. Moreover, the results highlight how the choice of rotational gates used in encoding can significantly impact the model's classification performance. The findings confirm that the embedding represents a hyperparameter for VQC models.
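
The two encoding families compared above can be sketched with plain state vectors. The sketch assumes RY rotations for angle encoding (the study also varies the rotation gate, which is not modeled here) and zero-padding to a power-of-two length for amplitude encoding. The function names are illustrative.

```python
import math


def angle_encode(features):
    """One qubit per feature: RY(x)|0> = [cos(x/2), sin(x/2)]."""
    return [(math.cos(x / 2.0), math.sin(x / 2.0)) for x in features]


def amplitude_encode(features):
    """Pad to a power-of-two length and L2-normalize so features become amplitudes."""
    size = 1
    while size < len(features):
        size *= 2
    padded = list(features) + [0.0] * (size - len(features))
    norm = math.sqrt(sum(v * v for v in padded))
    if norm == 0.0:
        raise ValueError("cannot amplitude-encode an all-zero vector")
    return [v / norm for v in padded]


features = [0.5, 1.0, 1.5]
qubit_states = angle_encode(features)   # three single-qubit states, one per feature
amplitudes = amplitude_encode(features)  # four amplitudes, i.e. two qubits
print(len(qubit_states), len(amplitudes))
```

Angle encoding uses one qubit per feature, whereas amplitude encoding packs n features into roughly log2(n) qubits at the cost of discarding the overall scale of the feature vector.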

angle and amplitude encoding strategy, artificial intelligence, quantum machine learning, (12 more...)

arXiv.org Artificial Intelligence

2508.00768

Country: Europe > Italy (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.35)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Glucose-ML: A collection of longitudinal diabetes datasets for development of robust AI solutions

Prioleau, Temiloluwa, Lu, Baiying, Cui, Yanjun

arXiv.org Artificial Intelligence, Jul-21-2025

Artificial intelligence (AI) algorithms are a critical part of state-of-the-art digital health technology for diabetes management. Yet limited access to large, high-quality datasets creates barriers that impede development of robust AI solutions. To accelerate development of transparent, reproducible, and robust AI solutions, we present Glucose-ML, a collection of 10 publicly available diabetes datasets, released within the last 7 years (i.e., 2018 - 2025). The Glucose-ML collection comprises over 300,000 days of continuous glucose monitor (CGM) data with a total of 38 million glucose samples collected from 2500+ people across 4 countries. Participants include persons living with type 1 diabetes, type 2 diabetes, prediabetes, and no diabetes. To support researchers and innovators in using this rich collection of diabetes datasets, we present a comparative analysis to guide algorithm developers with data selection. Additionally, we conduct a case study for the task of blood glucose prediction, one of the most common AI tasks within the field. Through this case study, we provide a benchmark for short-term blood glucose prediction across all 10 publicly available diabetes datasets within the Glucose-ML collection. We show that the same algorithm can have significantly different prediction results when developed/evaluated with different datasets. Findings from this study are then used to inform recommendations for developing robust AI solutions within the diabetes or broader health domain. We provide direct links to each longitudinal diabetes dataset in the Glucose-ML collection and openly provide our code.
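
The observation that one algorithm can score very differently across datasets can be illustrated with a deliberately simple baseline. The sketch below is not the benchmark model from the paper: it assumes a persistence forecast (predict the last observed reading) scored with RMSE, and the two glucose series are hypothetical values in mg/dL.

```python
import math


def persistence_forecast(series, horizon):
    """Pair each reading (used as the prediction) with the reading `horizon` steps later."""
    if horizon < 1 or horizon >= len(series):
        raise ValueError("horizon must be between 1 and len(series) - 1")
    return series[:-horizon], series[horizon:]  # (predictions, targets)


def rmse(predictions, targets):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets))


# Hypothetical CGM traces: one volatile, one stable.
series_a = [110, 115, 123, 130, 128, 120, 112]
series_b = [100, 101, 102, 101, 100, 99, 100]

error_a = rmse(*persistence_forecast(series_a, horizon=2))
error_b = rmse(*persistence_forecast(series_b, horizon=2))
print(error_a, error_b)
```

The same baseline yields a much larger error on the volatile trace, which is why results reported on a single dataset can be hard to compare.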

artificial intelligence, data mining, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2507.14077

Country: North America > United States (1.00)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry: Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Applied AI (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

A Comparative Study of Machine Learning Techniques for Early Prediction of Diabetes

Alzboon, Mowafaq Salem, Al-Batah, Mohammad, Alqaraleh, Muhyeeddin, Abuashour, Ahmad, Bader, Ahmad Fuad

arXiv.org Artificial Intelligence, Jun-13-2025

In many nations, diabetes is becoming a significant health problem, and early identification and control are crucial. Using machine learning algorithms to predict diabetes has yielded encouraging results. Using the Pima Indians Diabetes dataset, this study attempts to evaluate the efficacy of several machine-learning methods for diabetes prediction. The collection includes information on 768 patients, such as their ages, BMIs, and glucose levels. The techniques assessed are Logistic Regression, Decision Tree, Random Forest, k-Nearest Neighbors, Naive Bayes, Support Vector Machine, Gradient Boosting, and Neural Network. The findings indicate that the Neural Network algorithm performed the best, with an accuracy of 78.57%. The study implies that machine learning algorithms can aid diabetes prediction and be an efficient early detection tool. Diabetes is a chronic metabolic disease affecting millions worldwide and is a significant cause of morbidity and death [1]. High blood glucose levels characterize the disorder and can result in complications including cardiovascular disease, stroke, blindness, and amputations. To prevent or postpone complications, diabetes must be recognized and treated as soon as feasible; however, this can be challenging because symptoms may be mild or absent [2]. Machine learning (ML) is a subfield of artificial intelligence that comprises the development of algorithms that can learn from data and generate inferences or predictions without being explicitly programmed. ML algorithms are beneficial in several fields, including healthcare.

artificial intelligence, diabetes, machine learning, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/ComNet60156.2023.10366688

2506.1018

Country: Asia > Middle East > Jordan (0.15)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.90)

Industry: Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

LLM-Forest: Ensemble Learning of LLMs with Graph-Augmented Prompts for Data Imputation

He, Xinrui, Ban, Yikun, Zou, Jiaru, Wei, Tianxin, Cook, Curtiss B., He, Jingrui

arXiv.org Artificial Intelligence, Jan-4-2025

Missing data imputation is a critical challenge in various domains, such as healthcare and finance, where data completeness is vital for accurate analysis. Large language models (LLMs), trained on vast corpora, have shown strong potential in data generation, making them a promising tool for data imputation. However, challenges persist in designing effective prompts for a finetuning-free process and in mitigating the risk of LLM hallucinations. To address these issues, we propose a novel framework, LLM-Forest, which introduces a "forest" of few-shot learning LLM "trees" with confidence-based weighted voting, inspired by ensemble learning (Random Forest). This framework is established on a new concept of bipartite information graphs to identify high-quality relevant neighboring entries with both feature and value granularity. Extensive experiments on 9 real-world datasets demonstrate the effectiveness and efficiency of LLM-Forest.
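
The confidence-based weighted voting step can be sketched in a few lines. This is a minimal illustration under stated assumptions: each "tree" is taken to return an imputed value together with a confidence score, and the aggregate is the value with the largest summed confidence. The example outputs are hypothetical, and the graph-based neighbor selection and prompting are not modeled.

```python
from collections import defaultdict


def weighted_vote(predictions):
    """Aggregate (value, confidence) pairs; return the value with the largest total confidence."""
    totals = defaultdict(float)
    for value, confidence in predictions:
        totals[value] += confidence
    return max(totals, key=totals.get)


# Hypothetical outputs from three LLM "trees" imputing one missing categorical entry.
tree_outputs = [("normal", 0.9), ("high", 0.6), ("high", 0.5)]
imputed = weighted_vote(tree_outputs)
print(imputed)
```

Here two moderately confident votes outweigh one highly confident vote, which is the behavior that distinguishes weighted voting from picking the single most confident tree.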

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2410.2152

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.67)

Industry: Health & Medicine > Therapeutic Area (0.70)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)